Similarity, feature-based generalization and bias in novel onset clusters

Author

  • Adam Albright
Abstract

The past decade has seen significant advances in modeling gradient preferences of English speakers using grammars with probabilistic rules, constraint rankings, or weights inferred from the statistical properties of the lexicon (Coleman and Pierrehumbert 1997; Treiman, Kessler, Knewasser, Tincoff, and Bowman 2000; Frisch, Large, and Pisoni 2000; Bailey and Hahn 2001; Hammond 2004; Hayes and Wilson, in press; and many others). In study after study, where lexical counts have shown probabilistic differences between cluster types, experiments with native speakers have shown corresponding behavioral preferences. At the same time, studies investigating substantive biases in phonology have begun to focus on cases in which speakers’ preferences do not appear to follow straightforwardly from lexical statistics (Moreton 2002, 2007; Wilson 2003, 2006; Zhang and Lai 2006; Berent et al., in press). For example, Berent et al. (in press) show that English speakers prefer initial #bn over #bd, even though few or no English words begin with either cluster. These results show that speaker preferences cannot be guided solely by obvious lexical statistics such as literal biphone frequency. However, they leave open the question of whether such preferences might be learned by some less direct means, perhaps involving similarity, phonological features, or some other mechanism for generalizing from attested to unattested sequences. In this talk, I report on some attempts to address this question, exploring the capacity of similarity-based and natural class-based models to predict generalization to novel clusters.





Publication date: 2007